Dialect maps and dialect research; useful tools for automatic speech recognition?

Authors

  • Arne Kjell Foldvik
  • Knut Kvale
Abstract

Traditional dialect maps are based on data from carefully selected informants, which usually results in clear-cut dialect borders (isoglosses), with a given dialect characteristic present on one side of the isogloss and absent on the other. We illustrate some of the problems and pitfalls of using dialect maps for automatic speech recognition (ASR) by comparing results from traditional dialect research with investigations of the Norwegian part of the European SpeechDat database, centred on the two main types of /r/ pronunciation. Our analysis shows that traditional dialect maps and surveys may be of limited use in ASR. The extent to which the Norwegian findings have parallels in other countries will depend on two main factors: dialect allegiance versus a national standard pronunciation, and the extent to which the population is sedentary or mobile. Results from traditional dialect research may therefore be more useful in ASR for languages other than Norwegian.

Similar resources

Automatic Speech Recognition for Tunisian Dialect

Speech recognition for under-resourced languages has been an active field of research over the past decade. The Tunisian Arabic dialect has been chosen as a typical example of an under-resourced Arabic dialect. In this paper, we propose our first steps towards building an automatic speech recognition system for the Tunisian dialect. Several acoustic models have been trained using HMM-GMM and HMM-DNN ...

Automatic Dialect and Accent Recognition and its Application to Speech Recognition

Gaussian Mixture Selection and Data Selection for Unsupervised Spanish Dialect Classification

Automatic dialect classification has gained interest in the field of speech research because it is important to characterize speaker traits and to estimate knowledge that could improve integrated speech technology (e.g., speech recognition, speaker recognition). This study addresses novel advances in unsupervised spontaneous Latin American Spanish dialect classification. The problem considers ...

Parallel Speech Corpora of Japanese Dialects

Clean speech data is necessary for spoken language processing; however, there is no public Japanese dialect corpus collected for speech processing. Parallel speech corpora of dialects are also important because real dialects affect each other; however, the existing data only includes noisy speech data of dialects and their translation into the common language. In this paper, we collected parallel spee...

Multi-Dialectical Languages Effect on Speech Recognition

Research has shown that automatic speech recognition (ASR) performance typically decreases when evaluated on a dialectal variation of the same language that was not used for training its models. Similarly, models simultaneously trained on a group of dialects tend to underperform when compared to dialect-specific models. When trying to decide which dialect-specific model (recognizer) to use to d...

Publication date: 1998